Big Data and Visual Analytics by Sang C. Suh & Thomas Anthony
Author: Sang C. Suh & Thomas Anthony
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham
3 Statistical Similarity Based Data Compression
Conventional lossy coding schemes generally quantize or threshold data to adjust quality and reduce data size [51]. Their goal is to compress data without compromising its distinctive attributes. However, these schemes have so far restricted their attention to signal recovery in which distortion (distance) is measured with Euclidean-distance-based metrics such as the sum of squared errors (SSE) and the signal-to-noise ratio (SNR) [11, 29, 48, 51]. Because such metrics compare the original and decoded data element by element, using them requires the order of the data sequence to be preserved through encoding and decoding.
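The order dependence of a Euclidean measure can be illustrated with a small Python sketch. It is only an illustration under assumed names (`sse`, `original`, `shuffled`) and synthetic Gaussian data, none of which come from the chapter: a "decoded" sequence holding exactly the same values in a different order has the same empirical distribution as the original, yet its SSE is far from zero.

```python
import random


def sse(x, y):
    """Sum of squared errors between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic stand-in for noise-like measurements (an assumption).
original = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# A "decoded" sequence with exactly the same values in a different order.
shuffled = original[:]
rng.shuffle(shuffled)

print("SSE against itself:       ", sse(original, original))
print("SSE against shuffled copy:", sse(original, shuffled))
print("Same multiset of values:  ", sorted(original) == sorted(shuffled))
```

Under an SSE criterion the shuffled copy counts as heavily distorted, even though no value was changed.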
Employing the concept of a random variable introduces a new way of recovering a signal: data is reconstructed from a probability distribution learned during the encoding process, not from the encoded (quality-adjusted) data itself. Thus, the encoded output is not a direct representation of the original data; instead, the encoder informs the decoder how to regenerate it. If we relax the constraint of preserving the order of the data sequence, and treat a sequence of data as if it originates from a random variable, we can achieve a higher compression ratio.
This work presents a new class of compression scheme based on statistical similarity, dubbed IDEALEM (Implementation of Dynamic Extensible Adaptive Locally Exchangeable Measures) [28], which departs from the conventional Euclidean distance measure and instead focuses on the exchangeability of similar data sequences [36]. In particular, relaxing the requirement on the order of the data sequence yields much higher compression ratios.
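The general idea of exchanging statistically similar blocks can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the IDEALEM implementation: the two-sample Kolmogorov-Smirnov statistic as the similarity measure, the block size, the threshold, the token format, and all function names are hypothetical choices made for illustration.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def encode(data, block_size=32, threshold=0.3):
    """Emit ('raw', block) for a new block, or ('ref', k) when stored block k is similar."""
    refs = []    # blocks stored verbatim; the decoder rebuilds the same list
    tokens = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        match = next((k for k, r in enumerate(refs)
                      if len(r) == len(block)
                      and ks_statistic(r, block) <= threshold), None)
        if match is None:
            tokens.append(("raw", block))
            refs.append(block)
        else:
            tokens.append(("ref", match))  # an index replaces the whole block
    return tokens


def decode(tokens):
    """Regenerate data: replay raw blocks, substitute stored blocks for references."""
    refs, out = [], []
    for kind, payload in tokens:
        if kind == "raw":
            refs.append(payload)
            out.extend(payload)
        else:
            out.extend(refs[payload])
    return out


rng = random.Random(1)  # synthetic noise-like input (an assumption)
data = [rng.gauss(0.0, 1.0) for _ in range(3200)]
tokens = encode(data)
decoded = decode(tokens)
raw_blocks = sum(1 for kind, _ in tokens if kind == "raw")
print(f"{raw_blocks} of {len(tokens)} blocks stored verbatim")
```

The decoded sequence has the same length as the input but is not element-wise equal to it; blocks judged similar are replaced by an earlier block, which is acceptable only when the order of values within such blocks does not matter.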
Of course, not all application data can be explained by random numbers. However, in some situations, devices such as sensors may be measuring background noise for the majority of their operating time. In these cases, faithfully reproducing the random noise is not necessary.
